AirSim: High-Fidelity Visual and Physical Simulation for Autonomous Vehicles
Developing and testing algorithms for autonomous vehicles in the real world is an
expensive and time-consuming process. Also, in order to utilize recent advances
in machine intelligence and deep learning we need to collect a large amount of
annotated training data in a variety of conditions and environments. We present
a new simulator built on Unreal Engine that offers physically and visually
realistic simulations for both of these goals. Our simulator includes a physics
engine that can operate at a high frequency for real-time hardware-in-the-loop
(HITL) simulations with support for popular protocols (e.g. MavLink). The
simulator is designed from the ground up to be extensible to accommodate new
types of vehicles, hardware platforms and software protocols. In addition, the
modular design enables various components to be easily usable independently in
other projects. We demonstrate the simulator by first implementing a quadrotor
as an autonomous vehicle and then experimentally comparing the software
components with real-world flights.
Comment: Accepted for Field and Service Robotics conference 2017 (FSR 2017)
Learning Anytime Predictions in Neural Networks via Adaptive Loss Balancing
This work considers the trade-off between accuracy and test-time
computational cost of deep neural networks (DNNs) via anytime
predictions from auxiliary predictions. Specifically, we optimize auxiliary
losses jointly in an adaptive weighted sum, where the weights are
inversely proportional to the average of each loss. Intuitively, this balances the
losses to have the same scale. We demonstrate theoretical considerations that
motivate this approach from multiple viewpoints, including connecting it to
optimizing the geometric mean of the expectation of each loss, an objective
that ignores the scale of losses. Experimentally, the adaptive weights induce
more competitive anytime predictions on multiple recognition datasets and
models than non-adaptive approaches including weighting all losses equally. In
particular, anytime neural networks (ANNs) can achieve the same accuracy faster
using adaptive weights on a small network than using static constant weights on
a large one. For problems with high performance saturation, we also show that a
sequence of exponentially deepening ANNs can achieve near-optimal anytime
results at any budget, at the cost of a constant fraction of extra computation.
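As a rough illustration of the weighting rule described in this abstract, the following Python sketch sets each weight to the reciprocal of a running average of its loss, so that every weighted term lands on a similar scale. The function names, the exponential moving average, and the `momentum` and `eps` parameters are assumptions made for illustration; they are not details taken from the paper.

```python
def update_running_means(means, losses, momentum=0.9):
    """Exponential moving average of each auxiliary loss.

    The moving average and its momentum are assumed choices for this sketch.
    """
    return [momentum * m + (1.0 - momentum) * l for m, l in zip(means, losses)]


def adaptive_weights(means, eps=1e-8):
    """Weights inversely proportional to the average of each loss."""
    return [1.0 / (m + eps) for m in means]


def balanced_loss(losses, weights):
    """Weighted sum of the auxiliary losses.

    In a training loop the weights would be treated as constants,
    so no gradient would flow through them.
    """
    return sum(w * l for w, l in zip(weights, losses))


# Three hypothetical auxiliary losses on very different scales.
losses = [4.0, 0.5, 0.02]
weights = adaptive_weights(losses)

# After weighting, each term is close to 1.0 regardless of its raw scale.
terms = [w * l for w, l in zip(weights, losses)]
```

One way to read the link to the geometric mean: the gradient of log(l) is the gradient of l divided by l, so weighting each loss by the reciprocal of its value resembles a gradient step on the sum of the log-losses, which is the logarithm of the geometric mean up to a constant factor.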
Flight Dynamics-based Recovery of a UAV Trajectory using Ground Cameras
We propose a new method to estimate the 6-dof trajectory of a flying object
such as a quadrotor UAV within a 3D airspace monitored using multiple fixed
ground cameras. It is based on a new structure from motion formulation for the
3D reconstruction of a single moving point with known motion dynamics. Our main
contribution is a new bundle adjustment procedure which in addition to
optimizing the camera poses, regularizes the point trajectory using a prior
based on motion dynamics (or specifically flight dynamics). Furthermore, we can
infer the underlying control input sent to the UAV's autopilot that determined
its flight trajectory.
Our method requires neither perfect single-view tracking nor appearance
matching across views. For robustness, we allow the tracker to generate
multiple detections per frame in each video. The true detections and the data
association across videos are estimated using robust multi-view triangulation
and subsequently refined during our bundle adjustment procedure. Quantitative
evaluation on simulated data and experiments on real videos from indoor and
outdoor scenes demonstrate the effectiveness of our method.
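To make the triangulation step mentioned in this abstract more concrete, here is a minimal Python sketch of midpoint triangulation from two viewing rays. This is a simple stand-in, not the robust multi-view method the abstract refers to. The function names, and the assumption that each detection has already been converted into a ray (camera center plus direction) in a common world frame, are hypothetical.

```python
import math


def dot(u, v):
    """Dot product of two 3-vectors given as sequences."""
    return sum(x * y for x, y in zip(u, v))


def triangulate_midpoint(c1, d1, c2, d2):
    """Return the midpoint of the shortest segment between two rays, and its length.

    c1, c2: camera centers; d1, d2: ray directions (need not be unit length).
    """
    w = [p - q for p, q in zip(c1, c2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are (nearly) parallel")
    # Ray parameters that minimize the distance between the two rays.
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    p1 = [ci + s * di for ci, di in zip(c1, d1)]
    p2 = [ci + t * di for ci, di in zip(c2, d2)]
    midpoint = [(x + y) / 2.0 for x, y in zip(p1, p2)]
    gap = math.sqrt(sum((x - y) ** 2 for x, y in zip(p1, p2)))
    return midpoint, gap


# Two hypothetical ground cameras observing a point at (1, 1, 5).
point, gap = triangulate_midpoint(
    [0.0, 0.0, 0.0], [1.0, 1.0, 5.0],
    [2.0, 0.0, 0.0], [-1.0, 1.0, 5.0],
)
```

In a setting with several candidate detections per frame, the returned gap could serve as one crude consistency score for pairing detections across views: rays from the same object should pass close to each other. This is only a suggestion for how such a sketch might be used.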
Adaptive Information Gathering via Imitation Learning
In the adaptive information gathering problem, a policy is required to select
an informative sensing location using the history of measurements acquired thus
far. While there is an extensive amount of prior work investigating effective
practical approximations using variants of Shannon's entropy, the efficacy of
such policies heavily depends on the geometric distribution of objects in the
world. On the other hand, the principled approach of employing online POMDP
solvers is rendered impractical by the need to explicitly sample online from a
posterior distribution of world maps.
We present a novel data-driven imitation learning framework to efficiently
train information gathering policies. The policy imitates a clairvoyant oracle
- an oracle that at train time has full knowledge about the world map and can
compute maximally informative sensing locations. We analyze the learnt policy
by showing that offline imitation of a clairvoyant oracle is implicitly
equivalent to online oracle execution in conjunction with posterior sampling.
This observation allows us to obtain powerful near-optimality guarantees for
information gathering problems possessing an adaptive sub-modularity property.
As demonstrated on a spectrum of 2D and 3D exploration problems, the trained
policies enjoy the best of both worlds - they adapt to different world map
distributions while being computationally inexpensive to evaluate.
Comment: Robotics Science and Systems, 201
Discovering Blind Spots in Reinforcement Learning
Agents trained in simulation may make errors in the real world due to
mismatches between training and execution environments. These mistakes can be
dangerous and difficult to discover because the agent cannot predict them a
priori. We propose using oracle feedback to learn a predictive model of these
blind spots to reduce costly errors in real-world applications. We focus on
blind spots in reinforcement learning (RL) that occur due to incomplete state
representation: The agent does not have the appropriate features to represent
the true state of the world and thus cannot distinguish among numerous states.
We formalize the problem of discovering blind spots in RL as a noisy supervised
learning problem with class imbalance. We learn models to predict blind spots
in unseen regions of the state space by combining techniques for label
aggregation, calibration, and supervised learning. The models take into
consideration noise emerging from different forms of oracle feedback, including
demonstrations and corrections. We evaluate our approach on two domains and
show that it achieves higher predictive performance than baseline methods, and
that the learned model can be used to selectively query an oracle at execution
time to prevent errors. We also empirically analyze the biases of various
feedback types and how they influence the discovery of blind spots.
Comment: To appear at AAMAS 201